@hannesrudolph (Collaborator) commented Jun 10, 2025

Description

Fixes #4523

This PR adds support for DeepSeek R1 models when using the Chutes provider. Previously, when using DeepSeek R1 models via Chutes, the reasoning format wasn't recognized, causing reasoning blocks to be merged with regular content and degrading model performance.

Changes Made

  • Modified BaseOpenAiCompatibleProvider to expose the client property as protected instead of private, allowing subclasses to access the OpenAI client
  • Enhanced ChutesHandler to:
    • Detect DeepSeek R1 models by checking if the model ID starts with "deepseek-ai/DeepSeek-R1"
    • Parse reasoning chunks separately by handling delta.reasoning in the stream
    • Apply R1 format conversion for message formatting
    • Set appropriate temperature (0.6) for DeepSeek models
  • Migrated tests from Jest to Vitest format and added comprehensive tests for DeepSeek R1 functionality
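A minimal sketch of the detection and temperature steps listed above, not the merged code. Only the "deepseek-ai/DeepSeek-R1" prefix and the 0.6 value come from this description; the helper names, the 0.5 fallback, and the rule that a user-supplied temperature wins are assumptions made for illustration. Later commits in this PR adjusted both the model condition and the temperature lookup, so the merged logic may differ.

```typescript
// Prefix and temperature as stated in the PR description.
const DEEPSEEK_R1_PREFIX = "deepseek-ai/DeepSeek-R1"
const DEEPSEEK_R1_TEMPERATURE = 0.6
// Assumed fallback for non-R1 models (hypothetical value).
const DEFAULT_TEMPERATURE = 0.5

// Hypothetical helper: true when the model ID starts with the R1 prefix.
function isDeepSeekR1(modelId: string): boolean {
	return modelId.startsWith(DEEPSEEK_R1_PREFIX)
}

// Hypothetical helper: an explicit user setting wins (an assumption),
// otherwise R1 models get 0.6 and everything else gets the fallback.
function pickTemperature(modelId: string, userTemperature?: number): number {
	if (userTemperature !== undefined) {
		return userTemperature
	}
	return isDeepSeekR1(modelId) ? DEEPSEEK_R1_TEMPERATURE : DEFAULT_TEMPERATURE
}
```

Because the check is a prefix match, variants such as "deepseek-ai/DeepSeek-R1-0528" are also treated as R1 models in this sketch.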

Testing

  • All existing tests pass
  • Added tests for DeepSeek R1 reasoning format handling
  • Added tests for temperature settings
  • Manual testing completed:
    • Verified reasoning chunks are parsed separately
    • Confirmed R1 format is applied correctly

Verification of Acceptance Criteria

  • DeepSeek R1 models are properly detected when used via Chutes
  • Reasoning chunks are parsed separately and not merged with regular content
  • The R1 format is correctly applied for message formatting
  • Appropriate temperature settings are used for DeepSeek models
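The second criterion, keeping reasoning chunks apart from regular content, can be sketched as follows. This is an illustration under stated assumptions, not the handler's actual code: the `StreamDelta`, `StreamChunk`, and `OutputChunk` types and both function names are hypothetical stand-ins, and `reasoning` is treated as a provider-specific field on the streamed delta, as the description states.

```typescript
// Minimal stand-in shapes; a real streamed chunk carries more fields.
interface StreamDelta {
	content?: string | null
	reasoning?: string | null
}
interface StreamChunk {
	choices: Array<{ delta?: StreamDelta }>
}

type OutputChunk = { type: "reasoning"; text: string } | { type: "text"; text: string }

// Hypothetical helper: map one delta to zero, one, or two output chunks,
// keeping reasoning and regular content in separate chunk types.
function classifyDelta(delta: StreamDelta): OutputChunk[] {
	const out: OutputChunk[] = []
	if (delta.reasoning) {
		out.push({ type: "reasoning", text: delta.reasoning })
	}
	if (delta.content) {
		out.push({ type: "text", text: delta.content })
	}
	return out
}

// Hypothetical wrapper: apply the classification to every chunk of a stream.
async function* splitReasoning(stream: AsyncIterable<StreamChunk>): AsyncGenerator<OutputChunk> {
	for await (const chunk of stream) {
		const delta = chunk.choices[0]?.delta
		if (!delta) continue
		yield* classifyDelta(delta)
	}
}
```

A consumer can then route chunks of type "reasoning" to a separate display or buffer instead of appending them to the answer text.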

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • Documentation updated (if needed)
  • No breaking changes
  • All tests pass
  • Linting checks pass
  • Type checking passes

Important

Adds support for DeepSeek R1 models in Chutes provider, handling reasoning formats and temperature settings, with tests migrated to Vitest.

  • Behavior:
    • ChutesHandler now detects DeepSeek R1 models by checking if model ID starts with "deepseek-ai/DeepSeek-R1".
    • Parses reasoning chunks separately using delta.reasoning in the stream.
    • Applies R1 format conversion for message formatting.
    • Sets temperature to 0.6 for DeepSeek models.
  • Code Changes:
    • BaseOpenAiCompatibleProvider: client property changed from private to protected.
    • ChutesHandler: Implements createMessage() to handle DeepSeek R1 models with <think> tags.
    • getModel() in ChutesHandler adjusts temperature for DeepSeek R1 models.
  • Testing:
    • Migrated tests from Jest to Vitest.
    • Added tests for DeepSeek R1 reasoning format and temperature settings in chutes.spec.ts.
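The summary mentions handling of <think> tags. As a rough illustration only, the sketch below splits a complete string into reasoning and text segments. The `Segment` type and `splitThinkTags` function are hypothetical, and this is deliberately simpler than what a streaming handler needs: a streaming version must also cope with tags that are split across chunks, which this sketch does not attempt.

```typescript
type Segment = { type: "reasoning" | "text"; text: string }

// Hypothetical helper: split a complete response into <think>...</think>
// reasoning segments and the plain text around them.
function splitThinkTags(input: string): Segment[] {
	const segments: Segment[] = []
	const pattern = /<think>([\s\S]*?)<\/think>/g
	let cursor = 0
	let match: RegExpExecArray | null
	while ((match = pattern.exec(input)) !== null) {
		// Text before the tag, if any, is regular content.
		if (match.index > cursor) {
			segments.push({ type: "text", text: input.slice(cursor, match.index) })
		}
		segments.push({ type: "reasoning", text: match[1] ?? "" })
		cursor = match.index + match[0].length
	}
	// Anything after the last closing tag is regular content.
	if (cursor < input.length) {
		segments.push({ type: "text", text: input.slice(cursor) })
	}
	return segments
}
```

An unclosed <think> tag is left in the text segment as-is in this sketch.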

This description was created by Ellipsis for 9810152e6065c25dd3556866edb981515f7b9c3d.

@hannesrudolph hannesrudolph requested review from cte, jr and mrubens as code owners June 10, 2025 22:40
@dosubot (bot) added the size:L (This PR changes 100-499 lines, ignoring generated files) and enhancement (New feature or request) labels Jun 10, 2025
@daniel-lxs (Member) left a comment

It seems like the tests are being migrated to use .spec due to a monorepo.md rule.

It should also handle the DeepSeek Chimera model.

LGTM

@dosubot (bot) added the lgtm (This PR has been approved by a maintainer) label Jun 10, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Jun 10, 2025
@mrubens (Collaborator) commented Jun 12, 2025

Looks like there are some test conflicts, unfortunately. Do you mind updating? I'll take another look and approve once it's done.

hannesrudolph and others added 4 commits June 11, 2025 23:01
- Modified BaseOpenAiCompatibleProvider to expose client as protected
- Enhanced ChutesHandler to detect DeepSeek R1 models and parse reasoning chunks
- Applied R1 format conversion for message formatting
- Set appropriate temperature (0.6) for DeepSeek models
- Migrated tests from Jest to Vitest format
- Added comprehensive tests for DeepSeek R1 functionality

This ensures reasoning chunks are properly separated from regular content
when using DeepSeek R1 models via Chutes provider.

@daniel-lxs (Member)

@mrubens
The conflicts are resolved.

  protected readonly options: ApiHandlerOptions

- private client: OpenAI
+ protected client: OpenAI
Collaborator

Is this one necessary to change?

Collaborator

Never mind, I see now
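
For readers with the same question, the visibility change matters because TypeScript does not let a subclass read a `private` member of its base class, while `protected` does. The sketch below uses hypothetical stand-in classes (`FakeClient`, `BaseProvider`, `ChildHandler`) rather than the project's real ones.

```typescript
// Stand-in for the OpenAI client; hypothetical, for illustration only.
class FakeClient {
	describe(): string {
		return "client"
	}
}

class BaseProvider {
	// Declaring this `private` would turn `this.client` in the subclass
	// below into a compile-time error.
	protected client: FakeClient

	constructor() {
		this.client = new FakeClient()
	}
}

class ChildHandler extends BaseProvider {
	// The subclass can reach the client because it is `protected`.
	useClient(): string {
		return this.client.describe()
	}
}
```

Code outside the class hierarchy still cannot touch `client`, so the change widens access only to subclasses.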

@mrubens mrubens merged commit a851ffb into main Jun 12, 2025
12 checks passed
@mrubens mrubens deleted the 4523-2 branch June 12, 2025 15:39
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Jun 12, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Jun 12, 2025
cte pushed a commit that referenced this pull request Jun 24, 2025
* feat: Add DeepSeek R1 support to Chutes provider (#4523)

- Modified BaseOpenAiCompatibleProvider to expose client as protected
- Enhanced ChutesHandler to detect DeepSeek R1 models and parse reasoning chunks
- Applied R1 format conversion for message formatting
- Set appropriate temperature (0.6) for DeepSeek models
- Migrated tests from Jest to Vitest format
- Added comprehensive tests for DeepSeek R1 functionality

This ensures reasoning chunks are properly separated from regular content
when using DeepSeek R1 models via Chutes provider.

* feat: Enhance DeepSeek R1 support with <think> tag handling in Chutes provider

* fix: Correct temperature retrieval in ChutesHandler to use model's info

* fix: Update condition for DeepSeek-R1 model identification in createMessage method

---------

Co-authored-by: Daniel Riccio <[email protected]>
Labels

  • enhancement: New feature or request
  • lgtm: This PR has been approved by a maintainer
  • PR - Needs Review
  • size:L: This PR changes 100-499 lines, ignoring generated files
